🔥 Complete PyTorch Tutorial

From Fundamentals to Advanced Deep Learning (With Theory)

1. Introduction to PyTorch

PyTorch is an open-source deep learning framework developed by Meta (Facebook). It is widely used in research and industry due to its flexibility and ease of debugging.

Key Characteristics:

- Dynamic (eager) computation graphs that make debugging straightforward
- GPU acceleration via CUDA
- Automatic differentiation through autograd
- A Pythonic API that integrates cleanly with NumPy and the wider Python ecosystem

2. Installation & Setup

PyTorch can be installed using pip. GPU support depends on the build you install; the install selector on pytorch.org generates the right command for your platform and CUDA version.

```bash
pip install torch torchvision torchaudio
```

```python
import torch

print(torch.__version__)          # verify the installation
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is usable
```

3. Tensors (Core Data Structure)

A tensor is a multi-dimensional array, similar to NumPy arrays, but with GPU acceleration and automatic differentiation support.

```python
x = torch.tensor([1, 2, 3])  # 1-D tensor from a Python list
y = torch.rand(2, 3)         # 2x3 tensor of uniform random values
z = torch.zeros(3, 3)        # 3x3 tensor of zeros
```

Tensors support mathematical operations and broadcasting.
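
For example, broadcasting lets tensors of different shapes combine without explicit copies; a minimal sketch:

```python
a = torch.ones(2, 3)               # shape (2, 3)
b = torch.tensor([1.0, 2.0, 3.0])  # shape (3,)
c = a + b                          # b is broadcast across both rows -> shape (2, 3)
print(c)
```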

4. Autograd – Automatic Differentiation

Autograd is PyTorch’s automatic differentiation engine. It tracks operations on tensors and computes gradients during backpropagation.

```python
x = torch.tensor(2.0, requires_grad=True)
y = x ** 3     # forward pass: y = x^3
y.backward()   # computes dy/dx = 3x^2
print(x.grad)  # tensor(12.)
```

This mechanism forms the backbone of neural network training.
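
The same engine scales from a single scalar to many parameters at once; a small sketch (the variable names are illustrative):

```python
w = torch.randn(3, requires_grad=True)  # three trainable parameters
x = torch.tensor([1.0, 2.0, 3.0])       # fixed input
loss = (w * x).sum()                    # a scalar "loss"
loss.backward()
print(w.grad)                           # equals x, since d(loss)/dw_i = x_i
```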

5. Neural Networks and nn.Module

Custom neural networks in PyTorch are defined by subclassing torch.nn.Module.

```python
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 1)  # one fully connected layer: 10 inputs -> 1 output

    def forward(self, x):
        return self.fc(x)
```

The forward() method defines the computation flow.
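
Instantiating the module and calling it on a batch runs forward() under the hood:

```python
model = SimpleNet()
x = torch.randn(4, 10)  # batch of 4 samples, 10 features each
out = model(x)          # invokes forward() internally
print(out.shape)        # torch.Size([4, 1])
```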

6. Loss Functions and Optimizers

Loss functions measure model error, while optimizers update model weights.

```python
criterion = nn.MSELoss()                                   # mean squared error
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # stochastic gradient descent
```

Common losses include MSELoss (regression), CrossEntropyLoss (multi-class classification), and BCELoss / BCEWithLogitsLoss (binary classification).
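
CrossEntropyLoss expects raw logits and integer class targets; a minimal sketch:

```python
criterion = nn.CrossEntropyLoss()
logits = torch.randn(4, 3)             # batch of 4, 3 classes (raw scores, no softmax)
targets = torch.tensor([0, 2, 1, 0])   # integer class labels
loss = criterion(logits, targets)
print(loss.item())
```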

7. Training Loop (Most Important Concept)

The training loop performs forward pass, loss computation, backpropagation, and parameter updates.

```python
for epoch in range(100):
    optimizer.zero_grad()               # clear gradients from the previous step
    outputs = model(x_train)            # forward pass
    loss = criterion(outputs, y_train)  # compute error
    loss.backward()                     # backpropagate gradients
    optimizer.step()                    # update parameters
```

8. Datasets and DataLoaders

A Dataset wraps samples and labels; a DataLoader on top of it handles batching, shuffling, and efficient data loading.

```python
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(x_train, y_train)
loader = DataLoader(dataset, batch_size=32, shuffle=True)
```
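
The training loop then iterates over mini-batches instead of the full dataset; a sketch combining the loop from section 7 with the loader:

```python
for epoch in range(10):
    for x_batch, y_batch in loader:
        optimizer.zero_grad()
        outputs = model(x_batch)
        loss = criterion(outputs, y_batch)
        loss.backward()
        optimizer.step()
```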

9. Model Evaluation & Inference

During evaluation, layers with training-specific behavior, such as Dropout and BatchNorm, must be switched to inference mode.

```python
model.eval()           # switch Dropout/BatchNorm to inference behavior
with torch.no_grad():  # disable gradient tracking to save memory and compute
    predictions = model(x_test)
```
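
For classification, predictions are typically reduced to class indices and compared against labels; a small sketch, assuming the model outputs one score per class and y_test holds integer labels:

```python
pred_classes = predictions.argmax(dim=1)             # most likely class per sample
accuracy = (pred_classes == y_test).float().mean()   # fraction of correct predictions
print(f"accuracy: {accuracy.item():.3f}")
```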

10. Regularization Techniques

Regularization prevents overfitting and improves generalization.

Dropout

```python
nn.Dropout(p=0.5)  # randomly zeroes 50% of activations during training
```

L2 Regularization

```python
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=0.001,
    weight_decay=1e-4,  # L2 penalty on the weights
)
```

11. Weight Initialization

Proper initialization prevents vanishing and exploding gradients.

```python
nn.init.kaiming_normal_(model.fc.weight)  # He initialization, suited to ReLU networks
```
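
To initialize every layer rather than a single weight, a common pattern is model.apply() with a small helper (the helper name here is illustrative):

```python
def init_weights(m):
    if isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight)
        nn.init.zeros_(m.bias)

model.apply(init_weights)  # runs the helper on every submodule
```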

12. Learning Rate Scheduling

Schedulers adjust the learning rate during training to improve convergence.

```python
scheduler = torch.optim.lr_scheduler.StepLR(
    optimizer, step_size=10, gamma=0.1  # multiply the lr by 0.1 every 10 epochs
)
scheduler.step()  # typically called once per epoch
```
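
In context, the scheduler step follows the optimizer updates for the epoch; a sketch:

```python
for epoch in range(30):
    for x_batch, y_batch in loader:
        optimizer.zero_grad()
        loss = criterion(model(x_batch), y_batch)
        loss.backward()
        optimizer.step()
    scheduler.step()  # advance the schedule once per epoch, not per batch
```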

13. Convolutional Neural Networks (CNNs)

CNNs process grid-structured data such as images by sliding learned convolutional filters over the input.

```python
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 32, 3)        # 1 input channel, 32 filters, 3x3 kernel
        self.fc = nn.Linear(32 * 26 * 26, 10)  # a 28x28 input shrinks to 26x26 after the conv

    def forward(self, x):
        x = torch.relu(self.conv(x))
        x = x.view(x.size(0), -1)              # flatten all but the batch dimension
        return self.fc(x)
```
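
A quick shape check with a dummy batch confirms the flattened size (assumes 28x28 single-channel inputs, e.g. MNIST):

```python
model = CNN()
dummy = torch.randn(8, 1, 28, 28)  # batch of 8 grayscale 28x28 images
print(model(dummy).shape)          # torch.Size([8, 10])
```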

14. Transfer Learning

Transfer learning reuses pre-trained models to reduce training time and data requirements.

```python
from torchvision import models

# pretrained=True is deprecated in recent torchvision; the weights argument replaces it
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

for param in model.parameters():
    param.requires_grad = False  # freeze the backbone
```
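
With the backbone frozen, only a new task-specific head is trained; replacing the final layer is the usual pattern (num_classes is a placeholder):

```python
num_classes = 5  # placeholder: set to your task's class count
model.fc = nn.Linear(model.fc.in_features, num_classes)       # new head, trainable by default
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)  # optimize only the head
```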

15. Gradient Clipping

Gradient clipping prevents exploding gradients in deep networks.

```python
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```
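
Clipping belongs between the backward pass and the optimizer step; a sketch of the relevant slice of the training loop:

```python
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # rescale if norm > 1.0
optimizer.step()
```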

16. Mixed Precision Training

Mixed precision improves training speed and reduces memory usage.

```python
from torch.cuda.amp import autocast, GradScaler  # the torch.amp module in newer PyTorch versions

scaler = GradScaler()
```
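
The scaler wraps the backward pass and optimizer step, while autocast runs the forward pass in lower precision; a sketch assuming a CUDA device and the loop variables from the earlier sections:

```python
for x_batch, y_batch in loader:
    x_batch, y_batch = x_batch.cuda(), y_batch.cuda()
    optimizer.zero_grad()
    with autocast():                 # forward pass in mixed precision
        outputs = model(x_batch)
        loss = criterion(outputs, y_batch)
    scaler.scale(loss).backward()    # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)           # unscale gradients, then take the optimizer step
    scaler.update()                  # adjust the scale factor for the next iteration
```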

17. Saving and Loading Models

Saving models allows reuse and deployment.

```python
torch.save(model.state_dict(), "model.pth")     # save parameters only, not the class definition
model.load_state_dict(torch.load("model.pth"))  # the model object must be constructed first
```
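
For resuming training rather than just inference, a checkpoint usually bundles optimizer state and the epoch counter as well; a minimal sketch (the key names are conventional, not required):

```python
checkpoint = {
    "epoch": epoch,
    "model_state": model.state_dict(),
    "optimizer_state": optimizer.state_dict(),
}
torch.save(checkpoint, "checkpoint.pth")

ckpt = torch.load("checkpoint.pth")
model.load_state_dict(ckpt["model_state"])
optimizer.load_state_dict(ckpt["optimizer_state"])
```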

18. Best Practices

- Use Datasets and DataLoaders for batching and shuffling instead of feeding whole tensors
- Call model.train() before training and model.eval() with torch.no_grad() for inference
- Prefer saving and loading state_dict over pickling whole model objects
- Watch gradient norms and apply clipping when they explode
- Start from pre-trained models when data is limited

🎯 Conclusion: Mastering these PyTorch concepts enables you to build production-ready deep learning systems.